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Masahiro Imachi-'"*), 
Yasuhiko Shinno***-* and Hiroshi Yoneyama****^ 

Department of Physics, Yamagata University, Yamagata 990-8560, Japan)' 
Department of Physics, Saga University, Saga 840-8502, Japan^ 

■ In Monte Carlo simulations of lattice field theory with a 6 term, one confronts the 
' complex weight problem, or the sign problem. This is circumvented by performing the 
^ Fourier transform of the topological charge distribution P{Q). This procedure, however, 

causes flattening phenomenon of the free energy f(6), which makes study of the phase 
structure unfeasible. In order to treat this problem, we apply the maximum entropy method 
(MEM) to a Gaussian form of P{Q), which serves as a good example to test whether the 
' , MEM can be applied eff'ectively to the 9 term. We study the case with flattening as well 

0^ ■ as that without flattening. In the latter case, the results of the MEM agree with those 

obtained from the direct application of the Fourier transform. For the former, the MEM 
gives a smoother f{6) than that of the Fourier transform. Among various default models 
^ , investigated, the images which yield the least error do not show flattening, although some 

others cannot be excluded given the uncertainty related to statistical error. 

G^ ! SI. Introduction 

o. 
m ■ 

■ It is believed tiiat tiie 9 term could affect the dynamics at low energy and the 
vacuum structure of QCD, but it is known from experimental evidence that the value 
of 6 is strongly suppressed in Nature. Prom the theoretical point of view, the reason 
for this is not clear yet. Hence, it is important to study the properties of QCD 

D ! with the 9 term to clarify the structure of the QCD vacuum.^ For theories with 

' the 6 term, it has been pointed out that rich phase structures could be realized in 9 

^ ■ space. For example, the phase structure of the Z{N) gauge model was investigated 

^ i using free energy arguments, and it was found that oblique confinement phases could 

;h ' occur .'^ In CP''^"^ models, which have several dynamical properties in common with 

■ ■ QCD, it has been shown that a first-order phase transition exists at 9 = tt^^^ 

Although numerical simulation is one of the most effective tools to study non- 
perturbative properties of field theories, the introduction of the 9 term makes the 
Boltzmann weight complex. This makes it difficult to perform Monte Carlo (MC) 
simulations on a Euclidean lattice. This is the complex action problem, or the sign 
problem. In order to circumvent this problem, the following method is conventionally 
employed.^'^ The partition function Z{9) can be obtained by Fourier-transforming 
the topological charge distribution P{Q), which is calculated with a real positive 
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Boltzmann weight: 



/ [dzdz] 



-S+i9Q{z,z) 



where 

The measure [dzdz]Q in Eq. (|l-2j) is such that the integral is restricted to configura- 
tions of the field z with topological charge Q. Also, S represents the action. 

In the study of CP^^^ models, it is known that this algorithm works well for a 
small lattice volume V and in the strong coupling region 1^'^''^''^ As the volume 
increases or in the weak coupling region, however, this strategy too suffers from the 
sign problem for ~ vr. The error in P{Q) masks the true values of 2(9) in the 
vicinity of = vr, and this results in a fictitious signal of a phase transition.— This 
is called 'fiattening', because the free energy becomes almost fiat for 9 larger than 
a certain value. This problem could be remedied by reducing the error in P{Q)- 
This, however, is hopeless, because the amount of data needed to reduce the error 
to a given level increases exponentially with V. Recently, an alternative method has 
been proposed to circumvent the sign problem.^'^ 

Our aim in the present paper is to reconsider this problem from the point of view 
of the maximum entropy method (MEM).E3ni-E3niEl The MEM is weU known 
as a powerful tool for so-called ill-posed problems, where the number of parameters 
to be determined is much larger than the number of data points. It has been applied 
to a wide range of fields, such as radio astrophysics and condensed matter physics. 
Recently, spectral functions in lattice field theory have been widely studied by use 
of the MEM.^'^'^'I^ In the present paper, we are interested in whether the 
MEM can be applied effectively to the study of the 9 term and to what extent one 
can improve the fiattening phenomenon of the free energy. 

The MEM is based upon Bayes' theorem. It derives the most probable parame- 
ters by utilizing data sets and our knowledge about these parameters in terms of the 
probability. The probability distribution, which is called the posterior probability, 
is given by the product of the likelihood function and the prior probability. The 
latter is represented by the Shannon-Jaynes entropy, which plays an important role 
to guarantee the uniqueness of the solution, and the former is given by x^- It should 
be noted that artificial assumptions are not needed in the calculations, because the 
determination of a unique solution is carried out according to probability theory. 
Our task is to determine the image for which the posterior probability is maximized. 
In practice, however, it is difficult to find a unique solution in the huge configura- 
tion space of the image. In order to do so effectively, we employ the singular value 
decomposition (SVD). 

The flattening of the free energy is an inherent phenomenon in the Fourier 
transformation procedure. It is quite independent of the models used. We choose a 
Gaussian form of P{Q), which is realized in many cases, such as the strong coupling 
region of the CP^~^ model and the 2-d U(l) gauge model. Because the Gaussian 
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P{Q) can be analytically Fourier-transformed to 2(9), it provides a good example to 
investigate whether the MEM would be effective. For the analysis, we use mock data 
by adding noise to P{Q) in the cases with and without flattening. The most probable 
images of the partition function are obtained. The uncertainty of the images is used 
as an estimate of the error. 

Our conclusion is summarized as follows. 

1. In the case without flattening, the results of the MEM are consistent with those 
of the Fourier transformation and thus reproduce the exact results. 

2. In the case with flattening, the MEM yields a smoother free energy density than 
the Fourier transform. Among various default models investigated, some images 
with the least errors do not exhibit flattening. With regard to the question of 
which is the best among such images, further consideration of the systematic 
error is needed to check the robustness of the images. 

This paper is organized as follows. In the following section, we give an overview 
of the origin of flattening. In § |31 we summarize the procedure for the analysis of the 
MEM. The results obtained by use of the MEM are presented in § [l] A summary is 
given in § |S1 

§2. Sign problem and flattening behavior of the free energy 

In this section, we briefly review the flattening phenomenon of the free energy 
density. It is observed when one employs an algorithm in which Z{9) is calculated 
through the Fourier transform. In order to obtain Z{9), we must calculate P{Q) 
with high precision. Although P{Q) is calculated over a wide range of orders by use 
of the set methodj'^i^ the error in P{Q) yields error in Z{9) through the Fourier 
transform. This effect becomes serious in the large 9 region. Here, we use a Gaussian 
P{Q) for our investigation. The Gaussian P{Q) is not just a toy model, but indeed 
it is realized in a variety of models, such as the 2-d U(l) gauge model and in the 
strong coupling limit of CP^~^ models. 

We parameterize the Gaussian P{Q) as 



where, in the case of the 2-d U(l) gauge model, c is a constant depending on the 
inverse coupling constant /?, and V is the lattice volume. Hereafter, V is regarded as a 
parameter and varied in the analysis. The constant A is fixed so that P{Q) = 1- 
The distribution P{Q) is analytically transformed by use of the Poisson sum formula 
into the partition function 



To prepare the mock data, we add noise with variance 6 x P{Q) to the Gaussian 
P{Q). In the analysis, we consider sets of data with various values of 5 and study 
the effects of 5. 



P(Q) = ^exp[-^Q2] 



(2-1) 




n=— oo 



(2-2) 
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Fig. 1. Gaussian topological charge distribution for various lattice volumes. The parameter 5 is 
chosen to be 1/400. 



Figure ^ displays the Gaussian P{Q) for various lattice volumes V. The pa- 
rameter c is fixed at c = 7.42. The corresponding free energy densities, f{6) = 
— ylogZ{9), calculated with Eq. are plotted in Fig. |21 All functions f{9) in 

Fig. 121 fall on a universal curve for 6 < 2.3. For 6 > 2.3, finite size effects are clearly 
observed. As the volume increases, f{9) increases until V < 20, but for V = 30 and 
V = 50, the Fourier transformation does not work . At V = 30, Z{6) becomes nega- 
tive for certain values of 9, and atV = 50, f{9) becomes almost flat for 9 > 2.3. The 
latter behavior causes a fictitious signal of a first-order phase transition at ~ 2.3. 
The mechanism of this flattening is briefly summarized as follows. 

The distribution P{Q) obtained from MC simulations can be decomposed into 
two parts, a true value and an error: 

P{Q) = P{Q) + AP{Q), (2-3) 

where P{Q), P{Q) and AP{Q) denote the MC data, the true value and the error, 
respectively. In order to calculate P{Q) efficiently, which ranges over many orders, 
the set method and the trial function method are conventionally used. In the set 
method, the range of Q is divided into several sets, each of which consists of several 
bins. In each set, the topological charge of the configurations is calculated in order 
to construct the histogram. The trial function method makes the distribution of the 
histogram almost flat. This is useful for reducing the error in P{Q)- Accordingly, 
the error is computed aP^ 

AP{Q) ~ 5P{Q) X P(Q), 
where 6P{Q) is almost independent of Q. Because P{Q) is a rapidly decreasing 
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Fig. 2. Free energy density f{9) for various lattice volumes. f{6) is calculated numerically by 
Fourier-transforming the Gaussian topological charge distribution. 

function, so is the statistical error AP{Q). Thus, the dominant contribution to 
AP{Q) comes from that at Q = 0, and the partition function 2(9) has an error 
AP{0) ^ 5P(0)P(0) « 6P{0). The free energy density for the MC data is approxi- 
mately given by 



fiO) 



V 



log Z{9) « -llog{^P(Q)e*^« + 5P(0)} 



+ 6PiO)}, 



(2-4) 



where f{9) is the true free energy density. The value of e ^-^(^^ decreases rapidly as 
the volume V and/or 9 increase (see Fig. and the magnitude of e~^-^^^^ becomes 
of order of 6P{0) at ~ 9{. Therefore f{9) cannot be calculated precisely, but 
f(9) ~ const, for 9 > 9{. This is the reason for the flattening behavior for V = 50 
in the region > 0f ~ 2.3, as shown in Fig. [3 A drastic reduction of AP{Q) 
is necessary in order to properly estimate f{9) for 9 > 9f. In the case of a large 
volume, however, this is hopeless, because an exponentially increasing amount of 
data is needed. Therefore, we do need some other way to calculate f{9). 



§3. MEM 



In this section we briefly explain the concept of the MEM and the necessary 
procedures for the analysis in order to make this paper self-contained and to define 
the notation. 
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3.1. MEM based on Bayes' theorem 

In an experiment or a numerical simulation, data are always noisy, and the 
number of sets of data is finite. It is in principle impossible to reconstruct the true 
image from such data sets. Hence, it is reasonable to determine the most probable 
image. From Bayes' theorem, the probability that an image f occurs for a given data 
set {D} is given by 

where prob(^) is the probability that an event A occurs and prob(A|i?) is the con- 
ditional probability that A occurs under the condition that B occurs. Moreover, one 
can add to Eq. (|3-1() 'prior information' / about the image f . The information / 
includes that obtained from theoretical restrictions as well as knowledge based on 
previous experiments. With /, Eq. (|3-1|) becomes 

wfir> n Prob(D|f,/)prob(f|J) 

where prob(A, B) is the joint probability that events A and B occur simultaneously. 
The probability prob(f |D, /) is called the posterior probability. When the probability 
prob(D|f, /) is considered as a function of f for fixed data, it is equivalent to the 
likelihood function, which expresses how data points vary around the 'true value' 
corresponding to the true image f. The probability prob(f|/) is called the prior 
probability and represents our state of knowledge about the image f before the 
experiment is carried out. The most probable image satisfies the condition 

^prob(f|D,/)^„ 
of 

Recently, the MEM has been applied to hadronic spectral functions in lattice 
QCD.^'C^ In the analysis of the spectral function A{ijj), the correlation function 
D{t) is given by 

D{t)= dujK{T,u;)A{uj), (3-4) 
Jo 

where K(t, lo) denotes the kernel of the Laplace transform. In lattice theories, the 
number of data points is, at most, of order of 10 — lO'^, due to the finite volume, 
while the number of the degrees of freedom to describe the continuous function A{uj) 
is in the range 0(10^) - ©(lO^). 

Considering theories that include the 6 term, what we have to deal with is 

de^Z{0). (3-5) 

Comparing this with Eq. (|3-4|) . we see the correspondence 

{P{Q) ^ D{t), e-*^^/27r ^ K{t,lo), Z{9) ^ A{uj)}; (3-6) 
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that is, the continuous function 2{9) must be reconstructed from a finite number of 
data for P{Q), which constitutes an ih-posed problem. Given this situation, what we 
would like to do in the present paper is to rely on the MEM, employing the formula 

prob(Z(«)|F«3). /) = prob(Z(^)|/) P^;^^'fJg>g^jy> . (3.7) 

The likelihood function pToh{P{Q)\Z{9), I) is given by 

prob(P(Q)|Z(0),/) = ^-^, (3-8) 

where Xl is a normalization constant, and is defined by 

^2 ^ ^(p(^)(Q) _ P(Q))Cq;q,(p(^)(Q') - P{Q')) (3-9) 
Q,Q' 

in our case, where P^^\Q) is constructed from Z{9) through Eq. (|3-5|) . Here, P{Q) 
denotes the average of a data set {P{Q)}, i.e., 

1 

PiQ) = j^Y.P^'^(Q)' (3-10) 

where A^^ represents the number of sets of data. The matrix represents the 
inverse covariance obtained from the data set {P{Q)}. 

The prior probability pioh{Z{6)\I) is given in terms of the entropy S as 

prob(Z(0)|/,a) = -^, (3-11) 
Xs[a) 

where a is a real positive parameter and Xs{a) denotes an a-dependent normaliza- 
tion constant. The choice of the entropy S is somewhat flexible. Conventionally, the 
Shannon-Jaynes entropy. 



S= dO 



m[ff) 



(3-12) 



is employed, where m{9) is called the 'default model'. The default model m{9) 
must be taken so as to be consistent with prior knowledge. Therefore the posterior 
probability prob(^(6')|P((5), /, a, m) can be rewritten as 

prob(^(e)|P(Q), /, a, m) = (3-13) 

where it is explicitly expressed that a and m are regarded as new prior knowledge 
in prob(Z(0)|P(Q),/,a,m). 

The information / restricts the regions to be searched in image space and helps 
us to effectively determine a solution. We impose the criterion 

Z{9) > 0, (3-14) 
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so that pioh{Z{9) < 0\I,m) = 0. 

In order to obtain the best image of 2(6), we must find the solution such that 
the function 



W 



1 



-X +aS 



is maximized for a given a: 



5Z{e) 



(-;tx' + «5) 



0. 



(3-15) 



(3-16) 



The parameter a plays the role of determining the relative weights of S and ^x^- 
For Q = 0, the solution of Eq. H3-16() corresponds to the maximal likelihood, while 
for a ^ 1, Z{9) = m{6) is realized as a solution. Therefore, care must be taken in 
the choice of m{9). 

3.2. Procedure for the analysis 

In the numerical analysis, the continuous function Z{9) is discretized: 
Z{9) — > Z{9n) = Zn- Therefore, the integral over 9 in Eq. (|3-5|) is converted into a 
finite summation over 9: 



Ng 

E^^. = 1) 



n=l 

Ng 

E 

n=l 



COS &riJ 



Zn (otherwise) 



Ng 

^ = ^ ^ KjfiZfi, 

n=l 



(3-17) 



where j = 1, 2, • • • , Ng. Note that Ng < Ng. Here, Pj denotes P{Q) at Q = j - 1. 
Also, we have used the fact that P{Q) and Z{9) are even functions of Q and 9, 
respectively. Equation H3-12() is also discretized as 



Ng 



n=l 



Zn-mn- Zn log 



(3-18) 



where m{9n) = rUn- 

We employ the following procedure for our analysis.^'^ 

1. Maximizing W for fixed a: 

In order to find the image that maximizes W in the functional space of Zn 
for a given a, we calculate Eq. (|3-16p . This yields 



-alog— = ^inCr.'SPj, {n = l,2,--- ,Ne) 



where Cij is the covariance matrix in Eq. (|3-9|) . and 



(3-19) 



(3-20) 
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Solving for Z„ is non-trivial, because Ng ~ 0(10^) — 0(10^) and Ng ~ 
O(IO^) in our case. It is convenient to use the SVD and Newton's method. 
Details are given in Appendix 1X1 The solution to Eq. H3-19|) is called 

2. Averaging Z^'^: 

Since a is an artificial parameter, the final image that we obtain must have 
no a dependence. The a-independent final image can be calculated by averaging 
the image Z^ ^ with respect to the probability. The expectation value of Z^ is 
given by 

Zn = J [dZ]Zn prob(Z„|P(Q),I,m), (3-21) 

where the measure [dZ] = Y[n dZn/ y/Z^ is used.l^ Using the laws of the total 
probability and the conditional probability, we obtain 

Zn = J[dZ]Zn J da prob(Z„|Q;, P{Q),I, m)proh{a\P{Q) , I , m) 

^ j da pmh{a\P{Q),I,m)Z^''l (3-22) 

Further application of the total probability, the conditional probability and 
Bayes' theorem to prob(Q|P((5), I, m) yields 

Zn = ^j da 4") exp|yl(a) + (3-23) 

where is a normalization constant and A{a) = ^ log f^!^x^ ■ Here, the 
values Afc are eigenvalues of the real symmetric matrix in 9 space, 



1 f^jnc_ ,^ 
2^ dz^dzj 



(3-24) 



In deriving Eq. (|3-23j) . we have assumed that the probability prob(Z„|a, P{Q), I, m) 
has a sharp peak around Z^\ and W{Z^'^^) denotes the value of W for which 
Zn = Z^\ The derivation of Eq. H3-23() is given in Appendix IbI 

In averaging over a, we determine a range of a so that pToh{a\P{Q), I, m) > 
^xpioh{a\P{Q),I,m) holds, where pioh{a\P{Q),I^m) is maximized at a = a. 
The normalization constant is chosen such that 



/■«max 

/ dapmh{a\P{Q),I,m) = 1. (3-25) 



3. Error estimation: 

One of the advantages of the MEM is that it allows us to estimate the 
error of constructed images. Because the errors in Zn at different points could 
be correlated, the error estimation should be performed over some range in 
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6 space. This range is determined systematically by analyzing the Hessian 
matrix in 9 space, 



(3-26) 



The uncertainty of the final output image is calculated 

{{5Znf)^ J da{i6Z^^^f)pmhia\P{Q),I,m), (3-27) 



where 



m ) I 



j[dZ] /g, dend9ni5Zn5Zn'Woh{Z.m\P{Q),I, m, a) 

(3-28) 



f[dZ] /g, d9nd6n'proh{Zm\PiQ), I , m, a 



n 



Z=Z(°''> 



/g) dOndOn' Je "V dZndZn' 

See Appendix El for details. 

In the procedure described in this section, the uniqueness of the final image 
is guaranteed for a 7^ 0. This requires the conditions that the image Z^°^\6) be 
positive definite and the kernel {K^)nj be real. These are indeed satisfied, as shown 
in Appendix 10 • 

§4. Results 

In this section, we present the results of the MEM analysis of the data for P{Q). 
To prepare data for the analysis, we added to P{Q) Gaussian noise generated with 
the variance 5 x P{Q) for each value of Q. This way of adding noise is based on the 
procedure which was employed to calculate P{Q) in the simulations of the CP^~^ 
model.!^ This yields error which amounts to almost constant portion of P{Q) for 
each Q, as mentioned above Eq. ()2-4p . The parameter 6 was varied from 1/10 to 
1/600, and we present the results for 5 = 1/400. A set of data consists of P{Q) 
along with the errors from Q = to Q = Ng — 1. Employing such sets of data, 
we calculated the covariance matrices in Eq. ()3-9|) with the jackknife method. We 
have checked whether the outcome is stable by varying the value of N^i in the range 
10 < Nfi < 60 and found that this is the case for 30 < N^- We present here the 
results for Nd = 30. 

For the default model m{9) in Eq. I|3-12jl . we studied various cases: (i) m{6) = 
const., (ii) m{e) = (sin(e/2)/(^/2))^ = m^trg{0) , (iii) m{e) = e^-^-^l^ )■ In 
case (i), we studied several values, m{6) = 0.1,0.3 and 1.0. We present the results 
for m{9) = 1.0 as a typical case. Case (ii) corresponds to the strong coupling limit 
of the CP^~^ model. Case (iii) is the Gaussian case. The parameter 7 was varied 
in the analysis. 

The number of degrees of freedom in 6 space, Ng, is larger than that of the 
topological charge, Nq. The number Nq was chosen so as to satisfy P{Q) > 10^'^'^ in 
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Fig. 3. Z'^"\6) for the data without flattening. Here, V — 12. The defauh model m{6) is chosen 
to be the constant 1.0 and the Gaussian function with 7 — 0.8. 



the non-flattening case and P{Q) > 10 in the flattening case, and it varies from 
3 to 8, depending on V. The covariance matrix appears in Eq. (|3-9|) : 

cq,q' = ^^^^^ _ Y.{p^'^(Q) - pmip^'KQ') - piQ')), (4-1) 

where P^^\Q) denotes the l-th data of the topological charge distribution and P{Q) 
is the average Eq. p-lUj) . The inverse covariance matrix is calculated with such 
precision that the product of the covariance matrix and its inverse has off-diagonal 
elements that are at most C'(10~'^^). 

The number of the other degrees of freedom, Nq, was varied from 10 to 100, and 
it was found that the results are stable for 25 < Ng. In the following results, Nq is 
set to be 28. Note that in order to reproduce 2(9), which ranges over many orders, 
the analysis must be performed with quadruple precision. 

4.1. MEM analysis of the data without flattening 

Before discussing to what extent the flattening behavior of the free energy is 
remedied, we discuss the case without flattening. It is a non-trivial question whether 
the MEM is effective in the application considered here. 

The data for V < 20 were used in the analysis. For such data, no flattening 
behavior is observed (see Fig. I^J. For the present, we concentrate on the data for 
V = 12. For this value of V, two types of default models were employed, the constant 
default model, m{9) = 1.0, and the Gaussian one with 7 = 0.4 — 1.0. 

The maximal image ofZ{9) for a given a was calculated using Eq. (|3-16p . 
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Fig. 4. prob(a|P(Q), J, m) for the data without flattening. Ifere, V — 12. The default model is 
chosen to be the constant m{6) — 1.0 and the Gaussian function with 7 — 0.6, 0.8 and 1.0. 
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Fig. 5. Z^"\9) for the data with flattening. The default model m{6) is chosen to be the constant 
1.0 and the Gaussian function with 7 = 6. Comparing with the result of the Fourier transform 
(circles), the result of the MEM for certain values of a (a ~ 300 for m{6) — 1.0 and a ~ 2000 
for the Gaussian) approximately reproduces that of the exact partition function Eq. 12-21 1 (filled 
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Figure 01 displays Z^°'\6) calculated in this way for various a (0 < a < 1000) in the 
cases that m{6) is the constant 1.0 and the Gaussian function with 7 = 0.8. It 
is found that there is almost no discernible a dependence of Z^°'\6) and that the 
images approximately agree with the result of the Fourier transform, and thus with 
the exact partition function 2^pois(^)- 

In order to determine which 2^^") (6) is the most probable, we calculated the pos- 
terior probability prob(a|P(Q), /, m) in Eq. (|l-i-2l-ij) . The data for prob(a|P(Q), /, m) 
were fitted by smooth functions and normalized such that Eq. H3-25() holds. The re- 
sults are plotted in Fig. 0] For both cases, pii:oh{a\P{Q), I,m) has a peak at a small 
value of a; for m{6) = 1.0, the peak is located at a ~ 12.0, while in the Gaussian 
case it is located at a ~ 30.0 (7 = 0.6), a ~ 50.0 (7 = 0.8) and a ~ 120.0 (7 = 1.0). 
Because the functions Z^^^O) do not depend on a in the region around the peak, 
the integrals we must evaluate to obtain the averaged image Z(0) in Eq. H3-23() are 
trivially simple, and the functions Z{0) are approximately in agreement with the 
exact one. 

For V = 8 and 20, similar analyses were carried out. We find that the char- 
acteristics of Z^'^\9) and prob(a|P(Q), I, m) stated above, namely, that Z^°'\9) is 
almost independent of a for not so large values (a < 1000) and prob(a|P(Q), /, m) 
has a peak at a small value of a, are also observed for V = 8 and 20. Therefore, we 
obtain the same results for Z{6). More generally, in the non-fiattening case, where 
the Fourier transform works well, the fact that the image obtained using the MEM 
is consistent with the result of the Fourier transform can be understood by care- 
fully considering the equations used in the SVD. This is investigated analytically in 
Appendix IdI 

The detailed procedure for estimating the error 6Z{9) is discussed in the next 
subsection, and its results are given at the end. 

4.2. MEM analysis of the data with flattening 

Let us now turn to the case with fiattening. Unlike the case without fiattening, 
in this case the images of Z(9) display behavior that differs greatly from those of 
the Fourier transform. We fix the volume to y = 50 for the time being. Figure [3 
displays Z^°'\9) calculated with m{9) given by the constant 1.0 and the Gaussian 
form with 7 = 6, which are images determined by maximizing Eq. H3-15|l for each 
a. For each of the defaults, the maximal images are free from flattening, and at 
least for a certain value of a, there exists a Z^"'^ (9) which is in reasonable agreement 
with the exact one, Zpois{9). This also holds in the case that m(0) is Gaussian with 
7 = 4, 5 and 7, although the value of a for which we flnd the best agreement with 
^pois(^) depends on 7. In the case of ?Tistrg(^)j however, we find no agreement, even 
when a is varied from 0{1) to 0(10^). 

The posterior probability prob(o;|P(Q), I, m) was calculated with Eq. (|B-4|) ap- 
pearing in Eq. (|3-23|) . Figure displays the behavior of W{Z^°'^) and A{a) for various 
m(9). We flnd that as a increases, W{Z^°''>) decreases almost linearly, depending 
strongly on m{9), while A{a) increases with rather weak a dependence. The sum 
of W{Z^°'^) and A{a) gives prob(a|P(Q), /, m), and the balance between the two 
determines the location of the peak of proh{a\P{Q),I, m), if it exists. This is shown 
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Fig. 6. for various m{e) and for V = 50. W{2^"^) is plotted for mstrg(6') and for the 

Gaussian default models with 7 = 4, 5, 6 and 7. A{a) appearing in Eq. 13-2311 is also plotted on 
the same scale. Because A{a) does not depend strongly on m{6), A{a) for the Gaussian default 
model with 7 = 6 is displayed as a reference. 



in Fig. 13 

The averaged image 2 (6) was calculated using Eq. ()3-'23() . Integrating over a, 
the data for pToh{a\P{Q), I,m) were fitted by a smooth function. Then, the fitting 
function was normalized. Figure El displays 2(6) for various m{6). In the case of 
a Gaussian m(9), the parameter 7 is varied from 4 to 8. We find reasonably good 
agreement between the function 2(9) for 7 = 5 and 6. This is due to the fact that 
for these cases, the best value of a for 2^"\9) is approximately equal to the location 
of the peak of pmh{a\P{Q), I, m); for 7 = 6 the two values are nearly equal(Ri 2000), 
while for 7 = 5 they differ slightly (~ 1000 for the former and ~ 650 for the latter). 
In other words, these images could occur with high probability. In the m{9) = 1.0 



Table I. Z{9) and error at = 3.07 for the various default models depicted in Fig. |H| The exact 
value, 2pois(3.07), is 1.554 x 10"'^. 

m{e) 2(3.07) 5Z{3.07) 

strong 2.33x10"" 3.5x10"* 

Gauss 7 = 4 2.29x10"^ 7.4x10"* 

Gauss 7 = 5 8.99x10"* 1.9x10"'^ 

Gauss 7 = 6 3.38x10"^ 1.7x10"^ 

case, although the best image is in good agreement with 2pois{9) for a = 300, as 
shown in Fig. El the peak of proh(a\P{Q), I) is located in the region where a < 40.0. 
This large difference in a leads to a large deviation of 2(9) from the correct one, 
^pois(^)- Similarly, for mstrg(^)) the image 2^"\9) in agreement with ^pois(^) occurs 
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Fig. 7. proh {a\P{Q), I ,m) for various m{9). V is fixed to 50. 



with a very low probability, and consequently Z{0) deviates greatly from Zpois(^)- 
This is obvious, because Z^°'\6) never agrees with Zpois(^) for a = 0(1) to 0(10^). 
(Note that the errors are not included in the figure.) 

One of the advantages of the MEM analysis is that it allows estimation of the 
error according to the probability that the images are realized. In fact, the ob- 
tained images are meaningless unless their errors are evaluated. We use the formula 
Eq. (|3-28|) for the error estimates. The error 5Z{9n) of Z[9n) is estimated by in- 
tegrating 6 in Eq. H3-28() over the range O = 0^-6/2 and 0n+6/2! where 9m is the 
abscissa in the Gauss-Legendre A^-point (A^=100) quadrature formula for the range 
< ^ < vr. Figure ini displays typical behavior of the error as a function of the block 
size at ^ = 3.07 for Gaussian m{9) with 7 = 5.5, where the block size is defined by 
6+1. In order to show how large the errors of Z{9) in Fig.lHJare in the region near 



Table II. Z{6) aX 6 = 2.30 and 6 = 3.07 for various volumes V . The default model is chosen to be 
the Gaussian form with 7. The exact values Z^oisiO) are also listed. 

V 7 (Gaussian defauH) £:poi42.30) .3(2.30) Zp^^im) ^(3.07) 



"8 Ol 2.493x10"' 2.424(3) X 10"' 1.407x10"' 1.48491(6) x 10"' 

12 0.8 1.155x10"^ 1.145(3)xl0"2 3.752x10"^ 3.762(1) xlO"^ 

20 1.6 2.676x10"^ 2.583(2)xl0"^ 2.697x10"^ 2.946(6) xlO"^ 

30 3.4 4.372x10"^ 4.12(2) xlO"^ 1.023x10"" 8.3(3) xlO"* 

50 5.5 1.169x10"" 1.20(4) xlO"* 1.554x10"'^ 2.5(1.6) xlO"^ 



9 = TT, Table nihsts the values of Z{9) and their errors at 9 = 3.07. Together with 
the results displayed in Fig. |H1 we find that the more Z{9) deviates from the exact 
one, the larger the magnitude of the relative error becomes. 
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Among the default models we have investigated, it turns out that the errors 
for the Gaussian form with 7 = 5.5 are the smaUest. Figure EH plots the partition 
function with the estimated errors for the Gaussian default model with 7 = 5.5. 
The result is consistent with 2^pois(^) and does not exhibit flattening. For the other 
default models, m{9) = 1.0 and nistrgiO), it is found that the errors are too large to 
obtain reasonable images 2(0). 
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= 50. Compared to the result of the Fourier transform (circles), a remarkable improvement 
is clearly seen. 



We applied the same procedure to the data for other volumes. Table ^ gives 
Z{9) and 5Z{0) at 6 = 2.30 and 3.07 for various volumes and default models. Each 
of the two values of 6 was chosen as a reference, where the latter is close to vr and 
the former is near 9{. The default models listed in Table Ull give almost minimal 
errors among the defaults we have investigated for each volume. These were used 
to calculate the free energy density plotted in Fig. 1111 We find by comparing Fig. 
that the flattening behavior is no longer observed. 

§5. Summary 

We have considered lattice field theory with a 9 term. When studied numerically, 
this theory suffers from the complex Boltzmann weight problem, or the sign problem. 
As an attempt at an approach that differs from the conventional procedure, which 
employs the numerical Fourier transform of -P(Q), we have applied the MEM in order 
to reconstruct the partition function Z{9). In the MEM analysis, the image of Z{9) 
is calculated probabilistically, and the error is estimated as the uncertainty of the 
image. We have employed the Gaussian P{Q) as a test, because its Fourier transform 
can be computed analytically. We found that for the data without flattening, the 
results of the MEM analysis are consistent with those of the Fourier transform, 
while for those with flattening, the MEM reproduces reasonable images that exhibit 
no flattening by reducing the error contained in the data of P{Q) (5 ~ 1/400). 
Comments are in order. 
1. During the analysis, the condition Z{9) > in Eq. H3-14() was imposed as prior 
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information /. This plays an important role in searching for the maximal image 
in the case with flattening. This condition could yield results that differ from 
those found in the Fourier method, where 2{9) could become negative due to 
large error. 

It is important what default model is chosen for the analysis. A criterion for 
this choice is the magnitude of the errors of the averaged image, which are 
determined according to the probability. We have investigated various default 
models and found that 7 = 3.4 for V = 30 and 7 = 5.5 for V = 50 are the best 
among those which we have investigated in the flattening case. The purpose 
of the present paper is to check the feasibility of the application of the MEM 
to the case with the 6 term, and therefore we did not consider a large number 
of Gaussian defaults with different values of 7. There might be some better 
default models than those we have investigated here. 

The final image Z{6) depends on 6, which controls the magnitude of the error 
of the mock data P{Q). The parameter 5 directly affects the covariance matrix 
and indirectly influences the error 5Z{9) of the image, which is the uncertainty 
of the image. Although we employed the least uncertainty as a criterion, it is 
still unclear to what extent the obtained best image is close to the true one; 
i.e., there is a systematic error in Z(9), not reflected in 5Z{6). This is related 
to the freedom that we have in choosing the default model. Various default 
models may be allowed within the uncertainty related to 5. To check this point 
we may as well use the mock data that causes true flattening behavior of the 
free energy and study the interference of such an effect with the data used in 
the present study. This will be studied in a forthcoming paper. Furthermore, 
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some objective analysis of the parameter inference is needed with regard to the 
regularization of the unfolding problemP^ 

4. As the volume increases, the flattening problem becomes more severe, and 
higher precision of the computations becomes necessary. This is because a 
more precise calculation is required to obtain the inverse of the covariance ma- 
trix of P{Q) as the volume increases. In the case y = 8, for example, the 
Newton method with double precision is sufficient. For V = 50, however, we 
need quadruple precision. 

5. Because the flattening behavior is inherent in the Fourier procedure, no matter 
what model one treats, we have used the Gaussian P{Q) as a first attempt. The 
next step is to apply the MEM to more realistic models, such as the CP^~^ 
model and QCD, and to investigate its feasibility for them. The MEM analysis 
might also be applicable to some other models with the sign problem. It may be 
worthwhile to study whether it is effective for treating theories such as lattice 
field theory with a finite density. 
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Appendix A 

^VT* and the Newton Method 

In order to obtain the maximal image of Z{9) for fixed a, we must solve the 
following equation [see Eq. (|3-1H|) ]: 

-^^^^^=T.^3r.C7^^P^- (A-l) 

For later convenience, we introduce a new parameter a„ defined by 

a„ = log— . (A-2) 

We regard and Pj as Nq- and A'^^-dimensional vectors, respectively (n = 
1, 2, • • • ,Ne and j = 1, 2, • • • , Nq). In our case, Ne ~ 0(10^) - 0(10^) and Nq ~ 

0(10°). Therefore, it is non-trivial to find satisfying Eq. (|3-16|) . This task 
is made considerably simpler by employing the SVD. The transpose of K^, is 
decomposed as 

iK%j = {USV%, (A-3) 

where U is an Ng x Nq matrix satisfying U^U = 1, V is an NqX Nq orthogonal matrix, 
and S is an Nq x Nq diagonal matrix, = ^i5ij. The eigenvalues ^j, which are 



20 M. Imachi, Y. Shinno and H. Yoneyama 

positive semi-definite, are called the 'singular values' of -fC*. They could be arranged 
by appropriate permutations such that > ^2 > • • • > Cns > Cn^+i = • • • = i,Nq = 0, 
where 

Ns = rank(i^*) < Ng. (A-4) 

Following Bryanf^ the first Ng column vectors of U = {ui,U2, ■ ■ ■ ,itAr,) construct 
a basis in 'singular space', which is a subspace of the A'^g-dimensional space whose 
basis is {ui,U2, • " with Ui = {uii.U2i, ■ " lUNeiY- can then place the 

vector a in singular space 



a = ^Aitij. (A-5) 

i=l 

Substituting Eqs. (|A-3|) and (|A-5|) into Eq. (|A-lj) . we find 

-a KUni = Y,iU^V%iCr.^6Pj. (A-6) 

Use of [/*[/ = 1 then yields 

-aXi = Y{SV%,Cri^6Pi. (A-?) 

To solve Aj, the Newton method is employed. For each iteration, an increment 6X is 
given by 



E 



aSik - y^^{SV^)ij C-i ^ KinZn Unk 



k 

or in the matrix notation. 



-X5X = aX + Y5P, (A-9) 

where A and 6P are A'^g-dimensional vectors, and X and Y are Ng x Ng and Ng x Nq 
matrices, respectively: 

X = al + SV^C-^KZU, Y = SV^C'^. (A-10) 

We then need to calculate the inverse matrix of X to obtain 5X. Note that Ng = Nq 
holds for all the cases we have considered in the present paper. 

Appendix B 

Derivation of Eqs. iV- 2'il^ and 



1. Equation (|3-23() is derived as follows. 

The probability ^Toh{Zn\P{Q)^I.,ra) can be rewritten by use of the law of 
the total probability prob(j4) = J dB prob(A, B) as follows: 



^Toh{Zn\P{Q), I,m) = J da prob(Z„, a\P{Q),I, m) 



da prob(Zn|a, P{Q),I, m)prob(a|P(Q), I, m). 

(B-1) 
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In the second line, the definition of the conditional probability was used. 
Substituting Eq. HB-1|) into Eq. (|3-21() . we obtain 

2n = J[dZ]Zn J da proh{Zn\a,P{Q),I,m)proh{a\P{Q),I,m) 
da pvoh{a\P{Q), I,m) / [dZ]Zn pTob{Zn\a, P{Q), I ,m) 



j daY>Toh{a\P{Q),I,m)Z^^\ (B-2) 

where we have assumed that the probability prob(2^„|Q, P{Q), I, m) has a sharp 
peak around 2^n"^ • Utilizing Bayes' theorem, the law of the total probability, and 
the definition of the conditional probability, the probability prob(a|P((5), I-, m) 
can be rewritten as 

ui ir>tr^\ T \ prob(P(Q)|a,/,m)prob(a|/,m) 
prob(Q|-P((y), i, m) = 



prob(P(Q)|/,m 
prob(a|/, m) 



prob(P(Q)|/,m) 

prob(a|/, m) 
prob(P(Q)|/,m) 



[dZ] proh{P{Q),Zn\a,I,m) 

[dZ] prob(P(Q)|Z„,a,/,m) 
xprob(Z„|a, I, m) 



oc prob(a|/,m)y [dZ]^-^-^, (B-3) 

where Eq. H3-13() has been used and irrelevant factors, such as prob(P(Q)|/, m), 
have been ignored. 

Expanding W{Z) around {2^^°^} up to second order, we can perform the 
Gaussian integration over configurations of {Z}: 

prob(a|P(Q),/, m) 

n,n' 

OC probfall, m) expl - log — — h WiZ^'^^) I 

^ k ^ + ^k J 

= prob(a|/, m) expl yl(a) + W{Z^°''^)\ , (B-4) 



where 5Zn = Zn — Z^^ . Irrelevant constants have been ignored here, as be- 
fore. The values are eigenvalues of the real symmetric matrix in Q space 
defined in Eq. (|3-24|) . The prior probability prob(a|/, m) is conventionally cho- 
sen to be either the Laplace rule [prob(a|/, m) =const.] or the Jeffrey rule 
[prob(a|/, m) = 1/a]. Because the integral in Eq. ()B-2|) is insensitive to the 
choice of prob(a|/, m) as long as prob (a| P(Q), J, m) has a sharp peak, we em- 
ploy the Laplace rule for simplicity.^ 
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Substituting Eq. (|B-4|) into Eq. HB-2|) . we can obtain the equation for Z^, 



2. Equation ()3-28|) is derived as follows. 

The uncertainty of the final output image Zn is calculated as follows P^'^ 

{{6Znf) ^ J da((<54"))2)prob(a|P(Q),/,m), (B-G) 

where 

{{SZ^'^^f) = /[^^] le d0nden'6Zn6Zn,proh{Zm\P{Q), I , m, a) 
J[dZ] jQd9nd9n'proh{Zm\P{Q),I,m,a) 

Using /[dZ] pToh{Zm\PiQ),I,rn,a) = 1 and inserting Eq. we obtain 

When e^*-^^ is expanded around {Z^"^^} up to second order, and the integral 
over Zn is performed, Eq. (|3-28|) is derived as 



{{SZj-^f) = J^J^, dO^de^, I [dZ\6Z^6Z, ^ 



dM^n'b^^ 1 . (B-9) 



Appendix C 

Uniqueness of the Maximum of W 



Following Asakawa et. al.,^ let us check that a unique maximum of W exists 
for a ^ (> 0). In our case, the kernel is given by that of the Fourier transform. 
The curvature of W is given by 



-a 



Introducing any A'g-dimensional real vector 0), let us calculate 



z^y^^dZrndzJ''- ^y^^ 



mn 



7 KjmC-^^Kjn 



Vn. (C-2) 
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1. a = case 

When a = 0, Eq. (|C-2|) becomes 

Q2, dZ ~ ~ KimC-^ KjnUn- (C'3) 

mn " mn 

Because the covariance matrix C is symmetric, C is diagonahzed by an Nq x A'^g 
orthogonal matrix R as 

Also, defining the Nq x A'^g matrix 

K = R*K 
and the A'^g-dimensional real vector 

y ^ Ky, (C-4) 

Eq. ()C-3|) becomes 

5;,„^!^,„ = _5;|<„. ,c.5) 

This could vanish only for = 0, and this is realized for non-trivial vectors 
because Eq. HC-4|1 asserts that the dimension of the solution vector space is 

Ne - rankK > iVg - iVg > 0. 

In the a = case, therefore, there are multiple maxima of W . 

2. a 7^ case 

For Q / 0, Eq. ()C-2|) becomes 

^ d'^W sr-yn ST y'i in a\ 



Because Zn > 0, the curvature of W becomes negative definite: 

^y^'dZmdzJ''^^- 
Therefore, the entropy term is essential to the uniqueness of the maximum. 

Appendix D 

Comparison with the Fourier Method 



In the case of no flattening, we show that the analysis in the Newton method 
leads to the same result as that obtained using the Fourier method. This holds for 
a = and a ^ (but only small q > 0). Note that in the Fourier method, 

P = KZ (D-l) 
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is successfully inverted in the case without flattening. This is because P{Q) is a 
rapidly decreasing function, and 2(9), which is given by the Fourier transform, is 
smooth enough that the contribution from higher Q can be ignored. 

1. a = case 

Let us first consider the a = case. For a = 0, Eq. ()A-9p reduces to 



-Xo6X = Y6P, (D-2) 

where 

Xo = SV^C-^KZU. (D-3) 
When Xq is regular, the increment 6X becomes 

6X = -X-^Y6P. (D-4) 

When the integrations converge, i.e., when 6X = 0, in the Newton method, we 
find 

Xq^Y6P = 0. (D-5) 
Because the Nq x Nq matrix Xq^Y is regular (this can be checked numerically), 

5P = KZ - P = (D-6) 

is a unique solution. 

Equation ()D-6|) gives Nq equations: 

Ng Ng 

KinUln exp 

(E^"A)=^* (i = l,...,iV,). (D-7) 

71=1 j = l 

Thus, one obtains a unique solution for Xi{i = 1, . . . ,Nq) for a given default 
nin- This turns out to give a Zn that is equivalent to that found in the Fourier 
method. 

2. a ^ case (small a) 

When a 7^ 0, we use Eq. <\A-9^ in the Newton method. For small a, this is 
approximated by 

X6X = -Y6P. (D-8) 
By using the regularity of X, 6X is given by 

6X = -X-^Y6P. (D-9) 

When the integration converges, 6X = 0, the regularity of X^^Y implies 

6P = 0, (D-10) 



which is the same as Eq. (|D-6|1 . Therefore we obtain a unique solution in this 
case too. 
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We thus find that for a = and/or for small values of a, the Newton method leads 
to the same result as that obtained using the Fourier method. Moreover, if the 
probability prob(a|P(Q), /, m) dominates for a f» 0, then the averaged image Z„ is 
also in good agreement with that obtained from the Fourier method. Actually, this 
is true for the cases without flattening, as discussed in § 14.11 
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